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Using a matciied filter technique, we derive the minimum variance, unbiased estimator for the 
equilibrium displacement of a damped harmonic oscillator in thermal equilibrium when interactions 
with the thermal bath are the leading source of noise. We compare the variance in this optimal 
estimator with the variance in other, commonly used estimators in the presence of pure thermal 
noise and pure white noise. We also compare the variance in these estimators for a mixture of 
white and thermal noise. This result has implications for experimental design and the collection 
and analysis of data. 
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I. INTRODUCTION 

The torsion pendulum is currently used in a number of 
experimental programs to test theories of gravity (jjy and 
references therein). This work involves the detection of 
extremely small torques and requires the experimental- 
ist to design measurements whose precision approaches 
the fundamental limit posed by thermal noise prescribed 
by optimal (minimum variance, unbiased) statistical es- 
timation techniques. This requirement is familiar to the 
community engaged in these studies, and those readers 
will immediately ask why we need still another treatment 
of thermal noise on a damped harmonic oscillator. To an- 
swer this question, we begin with a simple example that 
illustrates a major shortcoming in customary methodolo- 
gies. 

Consider a linear oscillator in thermal equilibrium with 
a heat bath at absolute temperature, T. The equilibrium 
displacement of the pendulum, c, can be estimated by 
measuring the instantaneous displacement of the oscilla- 
tor, x{t), at t = 



x(0) 



(1) 



where the circumflex indicates a parameter estimate. 
The ensemble of such estimates is a random variable (the 
estimator), and it is represented by Cms- We will use 
the convention that a capital letter represents an ensem- 
ble and a lower-case letter represents a realization of the 
ensemble. The equipartition theorem prescribes the vari- 
ance of the instantaneous estimator 



var(Ci„s ) = = a , 

K 



(2) 



where kb is the Boltzmann constant and k is the torsional 
spring constant. 



If the data used in a parameter estimate is a continuous 
time series, x{t), one can define another estimator that 
has smaller variance than the instantaneous estimator. A 
familiar approach, the "boxcar" estimate, is an average 
of the displacement of the oscillator over the time series 
starting at t = and ending at i = r, 



Cbo 



1 



x{t)dt. 



(3) 



For the case of an oscillator dominated by thermal noise, 
one can calculate the variance of the boxcar estimator 
using the Fourier methods presented later in this article 

var(C6o:,) = -37^^ + u'^ + l-fujT + 27wV 

+ e-^^{{Z-i^uj - J") cos(wt) + {r" - 37^2) sin(tJT))) 

(4) 
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where uj^ — ^ K/m is the undamped oscillation fre- 
quency, 7 is the decay coefficient, and u — ^Jojq — ^'^ 
is the damped oscillation frequency. The quality fac- 
tor, Q = iOr/i2"f), is traditionally defined in terms of 
the resonant frequency, ujr = V^o ~ ^7^1 but to simplify 
equations appearing later, it is convenient to define an 
alternate quality factor, Qq = llJo/{2j), in terms of the 
undamped frequency. Similarly, we state the duration of 
the data sample in units of undamped oscillation periods, 
N = LUoT/i2T:). 

The solid curve in Figure ^ shows the logarithm of 
var{Cbox) / o''^ versus the logarithm of N for an oscilla- 
tor of quality factor, Qq = 50. It is apparent that the 
boxcar estimate is not optimal for the thermal-noise- 
dominated pendulum because the variance does not de- 
crease monotonically with increasing sample duration r. 
Adding more data cannot degrade the optimal estimate 
of a parameter. What then is the optimal estimate, Cop? 

We assert that the dashed curve in FigureHrepresents 
the variance of the optimal estimator, var ( Cop )/(T^. This 
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FIG. 1: The solid curve is the variance of the boxcar estimator 
of the equilibrium displacement, in units of the variance of 
the instantaneous estimator, versus the duration of the data 
sample in undamped oscillator periods. The oscillator has 
a quality factor of Qo = 50. The dashed curve is a plot of 
the variance of the optimal estimator, also in terms of the 
variance of the instantaneous estimator and for Qo = 50. 
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where 



This result completely describes the stochastic time evo- 
lution of a damped harmonic oscillator in contact with 
a heat bath. For a theorist, the problem is solved. And 
in a limited sense, this assessment is correct. Proba- 
bility theory, which produces equation (jSJ, attempts to 
characterize the measured values one would obtain given 
the parameters of the system. A statistical approach, on 
the other hand, is concerned with the inverse problem: 
to determine a measurement /inference scheme that pro- 
vides optimal estimates for the relevant parameters. As 
such, a statistical characterization is of keen interest for 
experimentalists because it provides insight into both the 
design of the experiment and analysis of the experimental 
data. 



curve is qualitatively consistent with what one would ex- 
pect from an optimal estimator — it does decrease mono- 
tonically with increasing r, and it lies on or below the 
boxcar estimator for all values of r. In this paper, we de- 
rive a closed-form expression for Cop under fairly general 
assumptions. The point of presenting this example is that 
up to now a solution to this basic estimation problem has 
not appeared in the experimental literature. This is the 
gap we wish to fill with this article. 

It is somewhat surprising that as we approach the 
lOO"' anniversary of Einstein's seminal work on Brow- 
nian motion there remain several, arguably canoni- 
cal, questions whose answers are not widely known in 
the physics community. Aside from the large number of 
people who have looked at the problem, several notable 
minds have studied Brownian motion on a damped har- 
monic oscillator. For example, in Chandrasekhar's 1943 
Reviews of Modern Physics article, "Stochastic Problems 
in Physics and Astronomy" Q , one learns that given the 
initial displacement and velocity, xq and vq, of a damped 
harmonic oscillator at t = 0, the probability distribution 
function for the displacement x at time i > is 



(5) 



(6) 



I 

To answer typical statistical questions that experimen- 
talists wish to ask regarding the damped harmonic oscil- 
lator, the autocovariance function of the stationary ther- 
mal noise ensemble provides sufficient information. Its 
form 

{5XtUt)SXth{t + At)) 

= a^e-^l'^*! (cos(cj|At|) + (^) sin(cj|Ai|)) 

is much simpler than (O because of the time-translation 
invariance of the stationary noise ensemble. Using either 
© or (0 one can calculate the variance of a particular 
estimator, but neither equation alone yields the minimum 
variance, unbiased estimator. 



W{x,t; xo,vq) = 



exp 



mix — xqc 



0t/2 (^cosh(%^) +|-sinh(%* 



|fe-'5*/2sinh(%^) 



* 1 e~'^* 



2^2^ ^ 2(3^ sinh^ (l^]+f]f3, sinh(/3it) + (3^ 



(3 = 2^. 
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The question of optimal estimation has been stud- 
ied extensively. In particular, because of applications 
to radar, optimal filter theory was intensely developed 
during World War II. For stationary noise processes, an- 
alyzing statistical estimation in the Fourier basis greatly 
simplifies the problem because the noise fluctuations in 
various Fourier components are not correlated one with 
another. The power spectrum of the thermal noise en- 
semble corresponding to the autocovariace shown in |(7J, 



S[SX, 



th\ 



(8) 



is a Fourier representation containing the same informa- 
tion. Basically, optimal filter theory states that the op- 
timal estimate is a weighted least-squares sum in Fourier 
space with the weights being determined by the signal- 
to-noise ratios of the various Fourier components. 

No detailed mathematical derivation is needed to ob- 
tain an optimal estimator for the constant c. Since the 
deflection parameterized by c has only a zero-frequency 
Fourier component, the optimal estimator, Cop, will also 
have only a zero-frequency component. Thus, for any sta- 
tionary noise process, optimal filter theory dictates that 
Cop is the boxcar estimator. Yet, according to the dis- 
cussion of Figure ^ it would appear that optimal filter 
theory produces the wrong result. 

This discrepancy arises because there are certain as- 
sumptions that must be satisfied in order for optimal 
filter theory to be valid. Chief among these assumptions 
is that the discrete Fourier components of the noise from 
finite duration data samples should be a good approxi- 
mation to the continuous Fourier components of the noise 
from infinite duration data samples. For N ^ Qq, where 
N is the number of undamped oscillation periods, this re- 
quirement is satisfied, and Figure^shows that var(C'box) 
does indeed approach var(C'op) for N ^ 50. For N < Qo, 
the narrowband thermal noise of a damped harmonic os- 
cillator does not satisfy the above requirement, and op- 
timal filter theory is not valid in that regime. When 
performing atomic force cantilever experiments for which 
the resonant frequency is measured in kiloHertz and the 
characteristic damping time, I/7, is measured in seconds, 
waiting for iV ^ Qo is a realistic possibility. For torsion 
balance experiments with milHHertz resonant frequen- 
cies and characteristic damping times of weeks, however, 
waiting for N ^ Qq in order to simplify the data analy- 
sis is clearly impractical. The majority of torsion balance 
experiments are conducted in the "bumpy" regime of the 
boxcar estimator in Figure ^ where optimal filter theory 
fails most miserably. 

Insight to the character of this problem has been 
suggested by Priestly 3 in the treatment of a related 
question^. Moreover, a formal solution was posed 
by Grenander |||, but we find that it provides the 
physicist with neither a great deal of physical insight 
nor a straightforward means of translating the results 



into equations involving measurements and the physical 
model. We therefore construct our derivation with meth- 
ods and tools more familiar to the experimental commu- 
nity and refer the interested reader to Grenander for a 
rigorous mathematical development. 

Having set out the question, we now present the an- 
swer. The optimal estimate of deflection is 



Qqujot 



(9) 



where Xi = x{0), Xf = x{t), Vi — v{0), Vf = v{t), and 
Xm is the boxcar estimate (PJ. The variance of the cor- 
responding estimator 



var((7, 



2a^ 



op J 



Qoujqt 



(10) 



is the smallest possible for an unbiased estimator of c 
constrained to using data of duration t. The dashed line 
in Figure n is a plot of (|10|l for Qq = 50. The derivation 
and discussion of @ and (|10|l are the major topics of this 
paper. It is not immediately apparent how these results 
follow from either {7)) or ||SJ), but the simplicity of 110|) 
implies that symmetries and appropriate transformations 
streamline the solution. The complexity of suggests 
the problem can become difficult if these symmetries are 
ignored. 

In section^]we present the estimation of linear param- 
eters in the presence of noise and show how to calculate 
the variance in the parameter estimators using the spec- 
tral power density of the noise process in the fundamental 
observable. We then derive the minimum variance, un- 
biased estimator of the equilibrium displacement of the 
torsion pendulum in the presence of white noise and ther- 
mal noise in section Hill Section Hvl examines the effects 
of multiple noise processes on the variance of different 
estimators. In particular we examine a superposition of 
white noise and thermal noise as well as transients caused 
by nonthermal disturbances to the oscillator. 



II. LINEAR PARAMETERS 

A. Estimating Linear Parameters 

A realization of data x(t) is a combination of the phys- 
ical signal, denoted x{t] p) as explained shortly, and a 
term representing additive noise 



x{t) = x{t;p) +5x{t). 



(11) 



We use the convention that a Greek letter p represents 
the physical value of the corresponding parameter p upon 
which the signal depends. A linear parameter is esti- 
mated by projecting a realization of the data x{t) onto 
an estimating function ep{t) 



P 



ep(t)x{t)dt. 



(12) 
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In the absence of noise, the data matches the physical 
signal and an unbiased estimating function returns the 
physical value of the parameter 



ep{t)x{t; p)dt. 



(13) 



Nearly any time domain filter fp{t) can be normalized 
to create a valid estimating function by requiring that the 
relation (|13|l be satisfied. Thus, for linear parameters, the 
filter fp(t) gives the estimating function 



fpit) 



JZofp{t)x{t;l)dt 



(14) 



where x(t] 1) is the unit-amplitude signal. The only re- 
striction on fp{t) is that must not be orthogonal to the 
signal. 



B. Calculating the Variance of Linear Parameter 
Estimates 



The variance in a parameter estimator is found using 
the estimating function and the autocovariance operator 



var(P)= f f ep{ti){SX{ti)SX{t2))ep{t2)dtidt2, 
Jo Jo 

(15) 

where we recall that capital letters represent ensembles. 
This time-domain representation, however, is not neces- 
sarily the most convenient or intuitively appealing for- 
mulation of the variance. For some noise processes, the 
Fourier basis is superior, yielding the expression 

var(P) = -/ F^[epit);iy]S[5Xity,i,]di,, (16) 
^ Jo 

where we denote functionals or linear operators with 
square brackets. The spectral power density of the noise, 
S[6X{t);i'], is given by 



SiSXity,!^] = 2 1 {6X{t)5X{t')) cos(27ri/(t - t'))dt', 



(17) 

and F'^[ep{t); f], which we call the Fourier energy density 
of the estimating function, is 

F2[ep(0;H = {Fc[ep{tyi^]f + {Fs[ep{ty,iy])\ (18) 

where 



/oo 
ep{t) cos{2Tivt)dt 
-OO 

Fsiepit);^] = V2 ep{t) sin{2'Kvt)dt 



(19) 



are the cosine and sine transforms of the estimating func- 
tion. We choose the normalization of (|19|l to preserve 
Parseval's relation, 



F'^[ep{t);v]dv 



{ep{t)Ydt. 



(20) 



Throughout the remainder of this paper we will often 
drop the explicit dependence on u of the Fourier energy 
density (FED) of the estimating function and the spec- 
tral power density (SPD) of the noise process and the 
dependence on t of the data and of the estimating func- 
tion. 



C. Power Density of Stationary Noise Processes 

The construction of the SPD in 117|l requires that the 
time-dependent autocovariance operator be known a pri- 
ori. In many instances a construction that uses the spec- 
tral information about a noise process is more transpar- 
ent. We present and discuss our preferred definition of 
the noise SPD in this section. 

The inverse transform of equation (|17|l is equal to the 
autocovariance operator 



S[5X] cos(27ri/(i - t'))dv = {6X{t)5X{t')) . (21) 



The special case where t = t' gives the instantaneous 
variance of the noise ensemble 



S[5X]dv = (^{SX{t)f'^ = vSir{SX{t)) 



(22) 



and shows that the SPD characterizes the contribution to 
the estimator variance from the noise contained in each 
infinitesimal frequency bin. Our preferred definition of 
the noise SPD is therefore 



S[SX;iy]= lim Av{ {Fc[SX;v ± Aiy/2]Y 



{Fs[6X;,y±A,y/2]f) (23) 



lim Aiy{F^\SX:i^±Ai^/2 



where 



1 

FclSxit);^ ± Avl2\ = V2 5x{t) cos{27rvt)dt 



Fs[Sx{t);v±Aiy/2] = V2 



(24) 



Sx{t) sm{2TTi't)dt 



are the finite bandwidth Fourier components of a noise 
realization. 

The relations H24|l are an essential aspect of a Fourier 
definition of the SPD because th e Fo urier components of 
a noise realization diverge as 1/ V Av ~ -y/r in the limit as 
Ai/ — !■ for a continuous noise spectrum. Furthermore, if 
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a realization of the noise were used in (|23() instead of the 
ensemble, S[6x; v\ = HmAi/-»o Az/i^^[(5x; Azy/2], while 
remaining finite, would not converge. Thus, the SPD is 
a property of the noise ensemble, not of any particular 
noise realization. 



D. Examples of Stationary Noise Processes 

We now calculate the variance in a parameter estima- 
tor due to the influence of three distinct stationary noise 
processes, monochromatic noise, white noise, and ther- 
mal noise. We consider in detail the monochromatic case 
because it gives insight to parameter estimation using 
Fourier techniques and because it provides a straightfor- 
ward way to verify the mutual consistency of the coeffi- 
cients in the variance equation (jKil) , in the two definitions 
of the SPD (|17() and (|23|1 . and in the normalization of the 
Fourier transforms ()19|) and (|24|l . 



1. Monochromatic Noise 

Consider the effect of an additive monochromatic noise 
component with amplitude e, frequency i^qj random 
phase on a physical signal. A realization of the data is 



x(t) — x{t; p) + e cos(27ri'oi 
The parameter estimate is then 

fixdt 



(25) 



-oo 



ep {x{t; p) + e cos(27ri/ot — 4>)) dt 



P + ~^J^c [cp; vq\ cos -I- [ejs; vq\ sin 0, 



(26) 



an ensemble of which (each with random phase) consti- 
tutes the parameter estimator. The variance of this esti- 
mator follows 

var(A) = ((A - pf) 



1 e 
2^ 2" 



2 ^2tx 

\ (-Fc [fip ; I'o] COS -t- Fs [ep ; I'o] sin < 
Jo 



-F^ [gp; i/o], 



(27) 



where the s subscript denotes the single frequency noise 
model. 

The variance in the ensemble of noise realizations is 

2 

var((5X,) = {|(ecos(27ri/oi - $))^^ = '—■ (28) 

A substitution of H28|) into (|22|) requires the monochro- 
matic SPD to be 



This result, when substituted into equation H16(l . gives 
the same value for the parameter variance as the direct 
calculation, H27(l . and shows that the coefficient of 1/2 in 
(|16|l is consistent with our normalization convention for 
the Fourier transforms, (|19|l and (|24l) . 

A rigorous derivation of the monochromatic SPD, us- 
ing the definition (|23(l . gives the same result as (|29|l 

S{5Xs\ v\ = ^lim^ Ai/ (F\8Xs\ v ± Aj//2]) 



X lim ■ 



A / 2 I 2\ ( ■ "2 ( ■k(v — iiq)\ I ■ 2 ( 7r(i/+i/Q)\ 
+ \^ ' ) +sm ' ) 



(30) 

where the second term, 5{y + vq), was dropped because 
the convention adopted in this paper does not allow neg- 
ative frequencies. This derivation shows the consistency 
of the two definitions of the SPD, ^ and it^ . 



2. White Noise 

Another example of a stationary noise process is white 
noise, with equal power at all frequencies, 

S[5Xwh] = constant = 77. (31) 

This eliminates all two-point time correlations giving the 
autocovariance operator 

{5X^H{ti)5X^H(t2)) = ^S{ti - t2). (32) 

Although ideal white noise yields infinite power, for this 
paper we restrict our attention to calculations for which 
no non-physical results occur for pure white noise. The 
variance in a parameter estimator may be written as 



(29) 



poo 

/ F^[ep]diy (33) 
Jo 

(ep)^di, 

where Parseval's relation was invoked in the final step. 

3. Thermal Noise 

The spectral power density of thermal noise is obtained 
beginning with the full equation of motion of the oscilla- 
tor 

m^+^^^+K)xit)^m (34) 
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where T{t) is the thermal driving force. According to 
the fluctuation dissipation theorem, the SPD of the driv- 
ing force associated with the thermal bath is a constant 
The response of the oscillator to this white 
driving force gives the SPD in displacement (cf. 10)) 



S[5X, 



th\ 



(35) 



The autocovariance operator is (cf. iQ). 



{5Xth{t)5Xth{t ^ 



-At)) 

(cos{uj\At\) 



sin(w| At| 



(36) 



The integral of the displacement SPD for thermal noise, 
unlike the total power of displacement white noise is finite 
and is related to the expectation value for the potential 
energy of the oscillator 



var{SX, 



th) 



S[SXth]diy : 



2(P.E. 



(37) 

In section lTlI BI wc find the optimal filter and calculate the 
parameter variance for a torsion balance in the presence 
of thermal noise. 




Frequency 



FIG. 2; Cartoon of a Lorentzian and several sinc^iivvT) FED 
curves, each corresponding to a different observation time, as 
a function of frequency. The thick solid curve is a Lorentzian 
peak with Qo = 7.5 (chosen for cosmetic reasons). The dash- 
two-dot FED curve corresponds to an observation time of 
1/10 of a period, the dash-dot curve is for 1/2 a period, the 
dashed curve is for 1.25 periods, the dotted is for 1.8 pe- 
riods, and the thin-line solid curve is for 8 periods. We see 
that, while the 1.25 period observation nulls much of the noise 
from the Lorentzian peak, the 1.8 period observation allows a 
contribution from the peak. The 8 period observation yields 
no significant contribution from the peak as the largest rel- 
ative maxima of the FED have already passed the peak and 
instead allow contributions from the constant portion of the 
Lorentzian. 



E. The Structure of Figure [T] 

We now have the tools needed to describe the structure 
of Figure ^ the variance in the boxcar estimator of the 
equilibrium displacement of the oscillator as a function of 
the sample time. The estimating function for the boxcar 
is 



"^{t)^Q{t;0,T)- 

T 



(38) 



where 0(t;ti,<2) = S{t — ti) — 0(t — ^2) is the boxcar 
function and 6{t) is the Heavyside (step) function. The 
square of the Fourier transform of this estimating func- 
tion gives the FED, 2 sinc^ (tti/t) , 



'2 Tnwh . 



sin(7rj/T) 



(39) 



The SPD of thermal noise acting on the oscillator is 
a Lorentzian H35|) . which for a high Q oscillator peaks 
sharply near the resonance frequency. 

When viewed as a function of frequency, the relative 
maxima and minima of the FED become more densely 
spaced as the observation time r grows. In contrast, the 
Lorentzian peak does not depend upon the observation 
time and remains fixed. Since the variance IjlGI) is propor- 
tional to the integral of the product of the SPD and the 
FED, the portions of the SPD near the minima of the 
FED contribute very little while the portions near the 
maxima, particularly the central maximum, contribute 



more significantly. Figure |21 is a cartoon of a sample 
Lorentzian peak and several FEDs, each with different 
sample times, as a function of frequency. 

We see that for very short sample times all of the 
Lorentzian noise contributes in nearly equal amounts be- 
cause the sinc^ {TTiyT) envelope is nearly constant. As 
the sample time increases, the high frequency noise con- 
tributes less to the variance. Eventually, the relative min- 
ima and maxima of the FED pass through the Lorentzian 
peak and the corresponding relative minima and max- 
ima of Figure n occur (primarily between 1 and 1/Qo 
periods). Finally, for sufficiently long sample times, the 
central peak and several relative maxima of the FED are 
so close to the origin that the constant, low- frequency 
portion of the Lorentzian dominates the variance. 



III. DERIVATION OF THE OPTIMAL FILTERS 

Given that nearly any filter can be normalized to cre- 
ate an estimating function, we use the calculus of varia- 
tion to find the optimal filter, /?^(<), that provides the 
minimum variance, unbiased estimator. This procedure 
requires constraints on the duration of the data sample 
that are difficult to express in the Fourier basis. As will 
be seen below, working in the time domain overcomes 
this challenge for white noise. In section lTlI BI we address 
this problem for thermal noise. 
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A. Optimal Filters for White Noise 

Consider the variance in a parameter estimate for the 
oscillator in the presence of white noise and with data 
from a continuous time series of length r, 



var(A«;i) = ^ / ^pd,t. 



(40) 



We take the variation in this equation under the con- 
straint (|13|l . imposed by introducing a Lagrange multi- 
plier A, to obtain 

(5(var(A«/i)) 

epx{t; p)dt - p^^ ^^^^ 
{r/ep — \x{t; p)) Sepdt. 



= S{^J ejdt - A 



Requiring that this variation be zero for all time shows 
that the optimal estimating function is proportional to 
the signal, x{t; p). 

A common expression of this filter, often called the 
"matched" filter, has the form of the unit-amplitude sig- 
nal 

/?-"(t)^e(i;0,T)x(t;l). (42) 

Normalizing the matched filter yields the optimal esti- 
mating function for white noise 



^ e(t;0,T)a;(t;l) 
f^{x{t- l)fdt' 



(43) 



The boxcar function, 0, must be included in order to cal- 
culate the FED of the estimating function when working 
in the frequency domain. 



B. Optimal Filters for Thermal Noise 

Since the autocovariance operator for thermal noise 
{Tj) is not diagonal, the optimal filter for thermal noise is 
more challenging to find, as the operator must be diago- 
nalizcd. To accomplish this, we first apply the equation- 
of-motion operator 17 to the data to obtain the thermal 
driving force 



n[x{t)] = m 



'dt^ 



e^ + «).(i). 



(44) 



Because the driving force T = il[Xth{t)\ is a white noise 
process with spectral power density AksTi,, it has the 
same diagonal covariance operator as white noise H32fl 
but with r/ replaced by AksT^: 



{n[Xth{ti)]n[Xth{t2)]) - {n[sxth{ti)M5Xth{t2)]) 

= 2kBTCS{t2-ti). 



(45) 



Thus, when working in the acceleration basis (the basis 
of the thermal driving force), the matched filter provides 
the miminum variance estimator. We define 



Zp{t) = eit;0,T)n[x{t;l)]. 



(46) 



to be the matched filter in the acceleration basis. Note 
that the transformation to the acceleration basis removes 
information about the boundary conditions. This loss of 
information is considered later. We normalize and apply 
(|46|l to the stochastic driving force to find the parameter 
estimate 



JZ,zpn[x{t)]dt 

JZ,zpn[x{t;l)]dt- 



(47) 



We now use zp{t) to find a corresponding filter fp{t) 
that can be applied directly to the displacement data by 
requiring 



/oo />oo 
fpxdt = / zpQ[x]dt 



(48) 



for all realizations of x. Integrating by parts yields the 
solution, 



zpn[x]dt — / zp {mx + ^x + Kx) dt 



zp [mx -\- ^x) 



rt^[zp]xdt, 



{zp{mx + ^x) + KZpx) dt 
+ I [mzp — ^ip + KZp) xdt 



(49) 



where we introduce the transpose equation-of-motion op- 
erator 



cT d^ d 
it — m— 7 — t — 
dt^ ^ dt 



(50) 



Thus, the optimal filter for thermal noise in the displace- 
ment basis obtained by a transformation from the accel- 
eration basis is 

f°''{t) = [zp{t)] = [e(t; O, r)n [x{t; 1)]] (51) 

where the oa superscript denotes that it is the opti- 
mal acceleration filter. Since fl^ acts on the Heavyside 
functions, fp°'{t) can contain terms involving Dirac delta 
functions and their derivatives. Normalizing this filter 
gives the optimal estimating function 



n'^ [Q{t;0,T)n[x{t;l)]] 



x{t; [Bit; 0, r)f] [x{t; 1)]] dt 



(52) 



where we include the infinitesimal e to avoid ambiguity 
regarding how the denominator is evaluated. 



8 



C. Estimating the Equilibrium Displacement 



We now calculate the optimal filter and resulting pa- 
rameter estimate for the equilibrium displacement of a 
thermally perturbed oscillator. The results of this section 
are valid only for estimating a single parameter. A sub- 
sequent paper will cover the more general case of several 
parameters. We assume that the viscous drag coefficient 
^ and the torsional spring constant k, or equivalently the 
damping coefficient 7 and the frequency wq, are known. 

The optimal filter for c is 

/r(i;c) = f^^ [e(t;0,T)r![x(i;l)]] 



m'uol {e{t) - e{t - t)) - 2^{5{t) - 5{t - t)) 



+ \{5\t)-6'{t-T))] 
^0 / 



(53) 



where 5' {t — to) is the time derivative of the delta func- 
tion. We normalize this filter to obtain the optimal esti- 
mating function for the equilibrium displacement of the 
oscillator, 
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(54) 



which yields the parameter estimate 



Xf - Xi + QoLOaTXm + Qo ^^^p"' ^ 



(55) 



Qoujqt 



The variance of the estimator corresponding to H55|l is 
easiest to calculate in the acceleration basis. A properly 
normalized filter must satisfy 



eTxdt 



[x] dt 



where 



Qit;0,T) 



(56) 



(57) 



is the result of normalizing the acceleration basis filter 
as in H14(l replacing x{t; 1) with U[x{t; 1)] . The variance 
in the estimator C°° is given by 



var(C'°'^) 



1 
2 

Qqujot' 



F^[y]S[n[SX]]diy 



y^dt 



(58) 
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FIG. 3: The variance of tfie estimator C°". The solid curve 
is using the boxcar estimate (|3J, the dotted line is using the 
optimal acceleration-only estimate l|55^ . and the dashed line 
is the optimal estimate l^. 



The dotted line in Figure |2| shows the variance in the 
estimator (7°" for Qq = 50. We see that this variance 
is indeed monotonic and smaller than the variance from 
the boxcar estimate for sample times larger than about 
0.01 periods, however, for very short sample times the 
boxcar estimate has the smaller variance. This failure 
results from the loss of information about the boundary 
conditions when transforming to the acceleration basis as 
mentioned in section Fill Bl These boundary conditions, 
when properly accounted for, rectify the failure of this 
approach for small sample times. 

The initial conditions are the natural boundary con- 
ditions because causality dictates that they depend only 
on the forces acting prior to the beginning of the sample. 
Moreover, because the driving force on the oscillator is 
white, the force time series before the sample is uncorre- 
lated with that during or after. The initial displacement, 
initial velocity, and the acting forces completely deter- 
mine the displacement of the oscillator. We, therefore, 
write the optimal parameter estimate as linear combina- 
tion of the initial conditions and a (not necessarily opti- 
mal) acceleration estimate 



C°*'' = WiXt +W2—+ W^c". 



(59) 



We wish to determine the choice of the constants wi, W2, 
and as well as the acceleration estimate that will pro- 
duce the overall minimum variance, unbiased estimator. 

Because the initial velocity contains no information 
about the equilibrium displacement, W2 must be zero to 
minimize the variance in the estimator that corresponds 
to the parameter estimate (|59|l . The variance is therefore 

(60) 

The condition that the estimator be unbiased provides 
the constraint 



var(C') = w^var(Xi) -I- wlvariC"). 



Wi + W3 — 1 



(61) 
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and minimizing the total variance establishes that the 
weights Wi and W3 are proportional to the inverse vari- 
ances. The weights are therefore given by 



and 



Wi 



W3 



var(C") 



var(Xi) + var(C°) 
var{Xi) 



var(Xi) + var(C°) 
and the variance simplifies to 



var(C) 



var(Xj)var(C"') 
var(Xi) +var(C'^)' 



(62) 



(63) 



(64) 



This last expression is minimized when the optimal ac- 
celeration estimator, described previously, is used for C". 
The optimal parameter estimate is then 



x,var(C°") + c°^va.r{X^) 

c = 

var(Xj) -I- var(C°'') 



Xi+Xf + QoUJQTXrn + Qo (^^IIJ^) 

2 + Qo^oT 



(65) 



Note that this parameter estimate has the same time- 
reversal symmetry as both the noise and the signal — a 
property that the acceleration-only estimate H55|l does 
not share — and that the weight assigned to the initial 
displacement (|62f) is that which restores the symmetry. 
The variance in the optimal parameter estimator is 



var(C" 



oth \ 



2^2 



2 + Qoujqt 



(66) 



The dashed curve in Figure 13 shows the optimal variance 
as a function of the sample time duration. For short 
time scales, the variance is constant and is dominated by 
the uncertainty in determining the initial displacement 
For long time scales, the variance falls as 1/r and 
is dominated by the fluctuations induced by the thermal 
bath. 

This behavior has implications for the utility of using 
an active feedback mechanism to damp the oscillator in 
an effort to reduce the total variance of the parameter 
estimator. While damping the motion of the oscillator 
does indeed reduce the variance in the estimate of the 
initial displacement, it does not change the variance due 
to thermal excitiations because the thermal driving force 
depends solely upon the temperature of the environment. 
Consequently, the value of using a feedback system de- 
pends upon the relative importance of the instantaneous 
measurement and the acceleration measurement of the 
equilibrium displacement for a particular experiment. In 
many instances only the acceleration estimator is used 
and a feedback mechanism provides no benefit. 



IV. MULTIPLE NOISE PROCESSES 

The noise background of a physical system is generally 
a superposition of several noise processes. Such a combi- 
nation renders the task of finding the optimal estimator 
difficult if not impossible because, among other things, 
the basis in which the noise SPD is diagonal is unknown. 
We investigate the effects that superposed white noise 
or residual transients caused by random, large amplitude 
disturbances to the oscillator have on the variance of sev- 
eral estimators: the boxcar, optimal thermal, and opti- 
mal acceleration estimators, as well as one that we will 
caU the Eot-Wash (EW) estimator. The EW estimator 
is related to the one used by the Eot-Wash experimental 
gravity group at the University of Washington ,8;] . Since 
the Eot-Wash group modulates their signal, their model 
involves several parameters. Multi-parameter estimation 
and a detailed analysis of the estimator used by the Eot- 
Wash group will be covered in a subsequent paper. 



A. Transients 

Nonthermal disturbances to a high Q oscillator may 
prevent the oscillator from ever reaching equilibrium with 
the thermal bath since the relaxation time of transients 
may be longer than the average time between the dis- 
turbances. Because of this, the inclusion of the initial, 
instantaneous displacement estimate, Xi, in the optimal 
estimate (|65ll can cause an increase in the variance of the 
estimator. To overcome this, consider the optimal accel- 
eration estimate in the acceleration basis (|56f) where the 
data time series is solely a transient 



ae cos 



(ujt) + be^^'^ sin(wt)] dt = 0. 

(67) 



Consequently, no transient can contribute to the param- 
eter estimate or the estimator variance when using the 
acceleration estimate provided that the disturbance that 
causes the transient does not occur while the data are 
being aquired. 

This shows that the acceleration estimator is supe- 
rior under certain conditions. To find these conditions 
we calculate the variance in the two estimators when an 
ensemble of random-phase disturbances with maximum 
displacement amplitude e cause a transient. The optimal 
thermal estimator has a variance of (cf. (|50|) and H62|) ') 



var(C°^ 



'th\ 



2a' 



2 + QqijJqt 2 \2 + QoujoT 



(68) 



Under the same conditions, the optimal acceleration es- 
timator has a variance of 



var(C° 



2^2 



(69) 
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FIG. 4; A plot of the normalized variance of the boxcar (di- 
amond), the EW (filled circle), and the optimal (square) es- 
timators in the presence of thermal noise as a function of 
sample time (in terms of the number of periods). The results 
are calculated to first order in 1/Qo. 



FIG. 5: A plot of the normalized variance of the EW (filled 
circle) and boxcar (diamond) estimators in the presence of 
white noise as a function of the sample time in terms of the 
number of periods. 



The acceleration estimate is superior when 

With a high Q oscillator, we see that transients as small 
as the thermal disturbances can render the optimal ther- 
mal estimate inferior to the acceleration estimate. This 
is true even when the sample time is small (r ~ 1/luoQo). 
For this reason, the acceleration estimator is often used 
in lieu of one that accounts for the initial displacement. 



same estimators in the presence of white noise. Because 
the optimal thermal estimator has infinite variance for 
this case, we show in figure the variance in the EW 
estimator compared with the variance of the boxcar in 
the presence of white noise as a function of the sample 
time. 

Not only is the EW estimator robust under these 
changes in the noise background, its variance is more 
immune by a factor of I/Qq the transient signal than 
is the boxcar. To illustrate this property, consider a box- 
car estimate. To leading order in 1/Qo and for an integer 
number of periods 



B. White Noise and Thermal Noise Combined 

1. Edt-Wash Approach 

When additive white noise is present, the use of Xi, 
Xf, Vi, and w/ in the optimal estimate yields infinite vari- 
ance in the optimal thermal estimator. However, if the 
white noise does not dominate, one need not resort to 
the boxcar estimate. The EW estimator is quite robust 
for systems that are dominated by either white noise or 
thermal noise. Its variance is within approximately 1/A^ 
of the optimum in cither case, where N is the number of 
oscillation periods in the data sample. 

The EW approach averages the data with itself delayed 
by half of a period. A boxcar average is then taken for 
an integer number of oscillation periods. The variance 
in the EW estimator in the presence of purely thermal 
noise is less than that of the boxcar estimator. Figure^ 
shows a comparison of the variance of the EW estimator 
with that of the optimal thermal estimate and the boxcar 
estimate as a function of the sample time. We see that 
the variance in the EW estimator is situated between 
the boxcar and optimal estimates and it approaches the 
optimal as roughly 1/A''. The robustness of the EW es- 
timator is manifest when we examine the variance of the 



^owh 



nV 



nV 



nV 



(ae ^* cos(a;t) -I- be s\n{ujt) + c)dt 



c + b^ + o(^ 



(71) 



where V is the period of the damped oscillator. The 
variance in the boxcar estimator, expressed to the same 
order, is 



var(C°"''') 



QoojT 2 \Qo 



(72) 



In order for the fractional increase in variance to be small, 
the amplitude of the transient disturbance, e, must sat- 
isfy 



£2 ^ QQ^a^ 



(73) 



2 nVn 

By comparison, with an extra half-period of data, the 



11 



EW estimate is 



2nV 



nV 



x(t)dt 







2nV 



{n+l/2)V 



V/2 



= c + a- ( - 
2 \uj 



2 / 1 



(74) 



and the variance of the estimator is 
3^2 



var(C"'") = 



£2 ^2 



1 



(75) 



Qoujqt 2 4 \>^o. 
In this small increase in variance need only satisfy 



2 nP7r3 ' 



(76) 



a significant relaxation of the constraint for the boxcar, 
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FIG. 6: Paneled plots of the optimal filters derived from 300 
discrete datum for a mixture of white and thermal noise. The 
top right corner is the optimal filter for pure white noise, top 
left has 10% white, lower left has 1% white, and lower right 
has 0.1% white noise. 



2. Numerical Results 

Because it is generally difficult to transform to a rep- 
resentation in which an arbitrary mixture of noise has a 
white power spectrum, numerical methods are often the 
only option available to reduce the uncertainty in a mea- 
surement due to the estimation technique. To employ 
numerical methods the data is discretized. The optimal 
estimator is then found using generalized least-squares 
analysis 0. As an example, we calculate the optimal es- 
timator using one and one-half periods of data sampled 
at 300 points. For a single linear parameter, the optimal 
parameter estimate is found using a discrete filter given 
by 



(q^ 



m, 



m. 



(77) 



where mx is the noise covariance matrix and q is some- 
times called the design vector. The design vector is given 
by the partial derivative of the parameterized data with 
respect to the parameter at each time step 



^ dc 



(78) 



For the equilibrium displacement of the oscillator, each 
component of the design vector is unity. The data is mul- 
tiplied by the filter H77|l to give the parameter estimate. 

To investigate the changes in the optimal filter as the 
noise background changes from pure white noise to pure 
thermal noise, we normalized the noise covariance matri- 
ces for white and thermal noise so that, with one and one 
half periods of data, the EW estimator has unit variance. 
We then combine some fraction of each of the covariance 
matrices so that the sum of the admixture coefficients 
is unity. Figure |S1 shows an interpolation of the optimal 
estimating vector for different mixtures of noise. We see 
that the optimal filter starts as a boxcar for pure white 



noise and approaches the combination of a boxcar with 
Dirac delta function derivatives (|65|l as the fraction of 
white noise is decreased. 

We evaluated the variance of the optimal estimator and 
compared it with the unity variance of the EW estimator 
for several noise mixtures. For the case of pure white 
noise, the variance in the optimal estimator is 89% of the 
variance in the EW estimator. The optimal estimator 
variance is 90% of the EW estimator variance for 10% 
white noise, 84% for 1% white noise, and 80% for 0.1% 
white noise. For pure thermal noise (not shown), the 
variance in the optimal estimator is 70% of that in the 
EW estimator. This analysis is valid for a mixture of only 
white noise and thermal noise; transient signals were not 
included. Filters such as those shown in figure El are not 
generally immune to transient signals. This fact again 
illustrates the robustness of the EW estimator because, 
in the variance, transient signals are only manifest at 
fourth order in 1/Qq. 



V. DISCUSSION 

Equation 1)65(1 defines the minimum variance, unbiased 
estimator for the equilibrium displacement of a damped 
harmonic oscillator when statistical fluctuations in ther- 
mal equilibrium are the dominant source of noise. In 
deriving this estimate we chose to transform the observ- 
able to the acceleration basis in which the thermal noise 
spectral power density has a diagonal form (equal noise 
power at all frequencies). Once in this "white noise" ba- 
sis, the minimum variance estimator is determined by 
application of the matched filter. A subsequent trans- 
formation of this estimator back into the displacement 
representation gives our result. 

This closed-form solution is of great advantage to the 
experimentalist. Such a solution for any noise process 
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serves to guide the design of an experimental apparatus 
and the methods used to gather and reduce the experi- 
mental data. The corresponding solution for white noise, 
the boxcar estimator, has been used extensively as an 
optimal estimator under proper criteria or as a point of 
departure for the construction of an appropriate estima- 
tor. 

One example is the EW estimator which is robust and 
near optimum under the restriction that the data sample 
is a half-integer number of periods in duration. In a lab- 
oratory such a requirement can often be met, but there 
are circumstances where this criteron is either inefficient, 
not feasable, or may be entirely beyond the control of the 
observer as is the case in relevant astrophysical scenar- 
ios. In such situations the EW estimator would fail to be 
near optimum and possibly fail to be defined (e.g. if only 
a single half period of data is given). Since the EW es- 
timator does not generally apply, one might be tempted 
to resort to the boxcar estimator. According to figure ^ 
the penalty for such a choice can be an increase in vari- 
ance by a factor of order Q. Such an increase can occur 
when the assumptions implicit in formulating optimal es- 
timates, like the boxcar and optimal thermal estimates, 
are not satisfied. 

When both white and thermal noise processes are 
present, neither solution is appropriate. Moreover, the 
method used in section IIIII to minimize the variance 
(transforming to a diagonal representation) may not be 
possible. Under certain circumstances one may find an 
estimator that is relatively immune to combinations of 
noise, such as the EW estimator. More generally, the 
only practical option is to discretize the data and use 
least-squares methods to find the optimal estimator nu- 
merically. In such situations, the interpretation of the 
numerical results may not be obvious and the closed form 
solution can provide appropriate guidance (c.f. figurelB)). 

While we have addressed some aspects of random noise 
beyond thermal noise, there are several systematic effects 
that we have neglected. These effects can be roughly di- 
vided into two groups: effects that can be modeled and 
incorporated into the analysis of the data and those that 
cannot. The latter group, which includes such things as 



temperature fluctuations, fiber anelasticity or nonlinear- 
ity, and sudden relaxations of the fiber (fiber quakes), 
will not be discussed in our articles. The former group, 
which includes linear fiber drift, damped oscillations, sig- 
nal modulation, etc. we will discuss. However, incorpo- 
rating these effects into the analysis requires an extension 
of the techniques developed in this paper. In future pub- 
lications we will address simultaneous fitting for several 
linear parameters (for example, to fit for a modulated sig- 
nal or linear fiber drift) and nonlinear parameters (such 
as the oscillation frequency and damping coefficient of 
the oscillator). 

These subsequent papers will also discuss some of the 
implications that the analytic results have on experimen- 
tal design. We have already mentioned at the end of sec- 
tion ^H] that the use of active feedback to damp the mo- 
tion of the oscillator when estimating the equilibrium dis- 
placement is beneficial only if one uses the instantaneous 
estimate ^ when determining the equilibrium displace- 
ment of the oscillator — compare H55|l and (|65|l . Another 
striking fact is revealed when fitting for the oscillation fre- 
quency of the oscillator. We will show that, for thermal 
noise, the optimal estimate of the oscillation frequency 
requires no more than four measurements of the displac- 
ment of the oscillator each period. That is, there is no 
direct benefit from having five or more displacment mea- 
surements for thermal-noise-limited experiments where 
the oscillation frequency is the signal. These two exam- 
ples demonstrate how an analytic expression for optimal 
parameter estimators can have significant implications 
for the design of an experimental apparatus — insights 
that do not emerge from numerical solutions. 
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